indicate that TFIIB will lie on the opposite face to TFIIA in the preinitiation complex.

TFIIB exists in solution as a monomer 22 and is presumed to bind the TBP-DNA complex as a monomer, but the exact stoichiometry of TFIIB in the TFIIB-TBP complex is not yet known. However, a monomer of the size of the C-terminal domain of TFIIB could interact with both ends of the bent DNA. For comparison, the conserved domain of TBP shown in Fig. 3 is 180 amino acids and the C-terminal domain of TFIIB is 208 amino acids long.

The sugars protected by TFIIB both upstream and downstream of the TATA box border the minor groove in the TBP-DNA model. This suggests that TFIIB, like TBP, interacts at least in part with DNA in the minor groove. Consistent with this prediction, TFIIB binding did not protect any G residues from methylation by dimethyl sulphate at the G+C-rich Ad-MLP (data not shown). TFIIB does not appear to make essential base-specific contacts with the major groove of the TATA box, as both TBP 12 and TBP-TFIIB form stable complexes with a (dI-dC)-substituted MLP TATA box (data not shown).

Our model for the binding of TFIIB to TBP-DNA revises previous models. Most describe TFIIB interactions with DNA only downstream of the TATA box 1 * 3 **. Our results show that, in addition to the C-terminal stirrup of TBP, a critical element of the TFIIB target is DNA bent in a specific orientation. It is unlikely that a monomer the size of TFIIB could interact with DNA on both sides of the TATA box if the DNA is not bent closer on itself by TBP. TBP may need to bend DNA when it binds in order to support the assembly of preinitiation complexes. TBP bound to distorted DNA is likely to be an important target for other components of the preinitiation complex 13.

The DNA polymerase from Thermus aquaticus (Taq polymerase), famous for its use in the polymerase chain reaction, is homologous to Escherichia coli DNA polymerase I (pol I) (ref. 1). Like pol I, Taq polymerase has a domain at its amino terminus (residues 1-290) that has 5' nuclease activity and a domain at its carboxy terminus that catalyses the polymerase reaction. Unlike pol I, the intervening domain in Taq polymerase has lost the editing 3'-5' exonuclease activity. Although the structure of the Klenow fragment of pol I has been known for ten years 2, that of the intact pol I has proved more elusive. The structure of Taq polymerase determined here at 2.4 Å resolution shows that the structures of the polymerase domains of the thermostable enzyme and of the Klenow fragment are nearly identical, whereas the catalytically critical carboxylate residues that bind two metal ions are missing from the remnants of the 3'-5' exonuclease active site of Taq polymerase. The first view of the 5' nuclease domain, responsible for excising the Okazaki RNA in lagging-strand DNA replication, shows a cluster of conserved divalent metal-ion-binding carboxylates at the bottom of a cleft. The location of this 5' nuclease active site some 70 Å from the polymerase active site in this crystal form highlights the unanswered question of how this domain works in concert with the polymerase domain to produce a duplex DNA product that contains only a nick.

Crystals of intact Taq polymerase (Table 1) diffract to 2.4 Å resolution at -165 °C using synchrotron radiation. The structure was solved initially from a 3.3 Å-resolution electron-density map, phased by multiple heavy-atom isomorphous replacement and improved by solvent flattening using a manually drawn envelope (Fig. 1 and Table 1). Although the polymerase domain shows a 51% amino-acid sequence identity with that of pol I (ref. 1), knowledge of the Klenow fragment (KF) structure did not help in the early stages of phasing, because even this conserved portion contributed too small a fraction to the X-ray scattering. The coordinates of Taq polymerase have been partially refined to an R-factor of 22.9% (R_free = 32.2%), with r.m.s. bond and angle deviations of 0.011 Å and 1.79°, respectively, for all data between 10 and 2.4 Å resolution. The tip of the 'thumb' in the polymerase domain is disordered and there are several regions in the 5' nuclease domain where the electron density is discontinuous, presumably because of disordering of loops: these regions include residues 12-13, 151-172 and 199-202. Furthermore, the residues from 172 to 233 are built here as polyalanine, again for reasons of poorly ordered electron density. The strung-out arrangement of the three domains in Taq polymerase results in an unusually elongated molecule that is 130 Å long in this crystal form (Fig. 2a).

Comparison of the structure of KF with the corresponding parts of the Taq polymerase structure shows, as expected from the sequence comparisons, that the polymerase domains are very nearly identical, whereas the 3'-5' exonuclease domains differ extensively (Fig. 2). Least-squares superposition of 353 of 407 corresponding α-carbon atoms in the polymerase domains resulted in an r.m.s. difference of 1.2 Å. By contrast, only 101 of 194 α-carbon atoms in the KF 3'-5' exonuclease domain could

TABLE 1 Experimental X-ray data and heavy-atom refinement

                            Resolution  Completion  No. unique   R_merge*  R_iso†  Phasing   Mean figure
                            (Å)         (%)         reflections  (%)       (%)     power‡§   of merit
Native I                    2.6         80.0        28,179        6.7                        0.686
Native II                   2.4         90.0        40,493        4.6
10 mM TMLA‖                 2.5         80.0        31,443        6.9      13.4    1.0
1 mM CH3HgCl                4.0         99.4         9,814        6.9       8.7    0.85
Sat. Baker's dimercurial    3.0         95.5        22,127       11.4      11.3    0.54
1 mM K2PtCl4                3.3         97 J.       17,447       11.8      27.3    0.53
1 mM KiPtU                  4.0         70.2         6,987       17.6      39.7    0.66
1 mM CH3HgCl/10 mM TMLA‖    3.3         92.0        16,541       11.8      24.9    1.10


Additional data sets were collected that extended to between 4 and 5 Å resolution for all heavy-atom derivatives (data not shown), and their phasing information was combined with the data shown in the table. These supplementary data produced a map with a better-defined solvent boundary and an improved connectivity of the main-chain backbone. Crystals of Taq polymerase were grown at 22 °C in hanging drops containing 3 µl protein solution (10 mg ml⁻¹ Taq polymerase in 18 mM Tris-HCl, pH 8.2, 0.09 mM EDTA, 0.9 mM DTT, 90 mM KCl, 9% (v/v) glycerol and 0.7% (w/v) β-octyl glucoside) and 3 µl reservoir solution (15% (w/v) PEG8000, 60 mM ammonium sulphate, 2 mM DTT, 0.2% (w/v) sodium azide and 100 mM sodium citrate, pH 5.5). The crystals belong to space group P3₁21 and have unit cell dimensions of a = b = 108.0 Å, c = 171.2 Å, α = β = 90° and γ = 120°. The presence of one molecule per asymmetric unit gives a crystal volume per protein mass (V_M) of 3.54 Å³ per dalton and a solvent content of 65% by volume 23. In order to make heavy-atom derivatives with mercurials, wild-type Taq polymerase (no cysteine residues) was mutated by site-directed mutagenesis to introduce three consecutive cysteines at positions 575 to 577. Crystals of this protein were used in all native and derivative data sets. Crystals were flash-frozen at -165 °C after first transferring to a stabilizing solution containing 40 mM sodium citrate, pH 5.5, 10% (v/v) glycerol, 100 mM KCl, 0.4% (w/v) β-octyl glucoside and 31% (w/v) PEG8000 for 24 h, and then to a second stabilizing solution containing 20 mM HEPES, pH 7.4, 10% (v/v) glycerol, 100 mM KCl, 0.4% (w/v) β-octyl glucoside and 33% (w/v) PEG8000 for 36 h. A 2.6-Å native data set collected at BL-6A2 of the Photon Factory was used in the initial survey for heavy-atom derivatives, but not subsequently. Native I and four derivative data sets were collected at the CHESS A1 beam line (λ = 0.908 Å) equipped with a CCD camera detector. The native II data set was collected at the X12C beam line of the National Synchrotron Light Source (NSLS) at the Brookhaven National Laboratory (λ = 1.00 Å, equipped with the MAR X-ray detector system). Two derivative data sets were collected on an R-AXIS IIC X-ray detector system mounted on a Rigaku-200 rotating anode. All data were reduced using DENZO and scaled using SCALEPACK (programs written by Z. Otwinowski). The position of the Taq polymerase in the crystal was solved by molecular replacement at 4 Å resolution using a model of the polymerase domain that contained 45% of the scattering mass of Taq polymerase and was based on the structure of Klenow fragment refined at 2.5 Å resolution (J. Jaeger, 0. CahU and T.A.S., unpublished result). The rotation function search was done using MERLOT 23, the Patterson correlation refinement and the translation function in X-PLOR 34. Phases calculated from the model allowed location of bound heavy atoms by difference-Fourier syntheses at 4 Å resolution but were not used directly in the structure determination. An MIR electron density map calculated at 3.3 Å resolution using these derivatives refined with ML-PHARE 28 was improved by solvent flattening, histogram matching and phase combination using SQUASH. A polyalanine model fitted to this map allowed computation of a molecular envelope around the model with a 5 Å radius for each atom using the program O (ref. 27). Further solvent flattening with this manually drawn envelope using SQUASH and interactive heavy-atom refinement against flattened phases improved the map quality. Refinement of the structure built into this map, including simulated annealing, positional refinement and manual model rebuilding, was done against data from 10 to 2.4 Å resolution. The present structure includes 776 amino acids, one β-octyl glucoside, and 297 water molecules. Fifty-six amino acids are not visible and sixty are modelled by polyalanine.
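
The relation between the quoted crystal volume per protein mass and the quoted solvent content can be checked with a short calculation. The sketch below is not taken from the paper: it assumes the commonly used Matthews approximation, in which protein occupies about 1.23 Å³ per dalton (equivalent to a partial specific volume near 0.74 cm³ g⁻¹). The function name and the default constant are illustrative choices.

```python
def solvent_fraction(v_m, protein_volume_per_dalton=1.23):
    """Estimate the solvent fraction of a protein crystal.

    v_m: crystal volume per unit protein mass (A^3 per dalton).
    protein_volume_per_dalton: assumed volume taken up by protein itself,
        about 1.23 A^3/Da for a typical protein (an approximation, not a
        value given in the paper).
    """
    if v_m <= 0:
        raise ValueError("v_m must be positive")
    return 1.0 - protein_volume_per_dalton / v_m


# Value quoted in the Table 1 legend: 3.54 A^3 per dalton.
print(f"{solvent_fraction(3.54):.1%}")
```

Under that assumption, 3.54 Å³ per dalton corresponds to roughly 65% solvent, in line with the figure quoted in the legend.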

* $R_{\mathrm{merge}} = \sum |I - \langle I \rangle| / \sum I$, where $I$ is the observed intensity and $\langle I \rangle$ is the average intensity from multiple measurements.

† $R_{\mathrm{iso}} = \sum \big| |F_{PH}| - |F_{P}| \big| / \sum |F_{P}|$

‡ Data beyond 3.3 Å resolution were not included in the phase refinement or heavy-atom statistics for any derivatives.
§ Phasing power, $F_H/E$ = r.m.s. ($F_H$)/r.m.s. (lack of closure), where $F_H$ is the calculated heavy-atom structure factor.

‖ TMLA, trimethyl lead acetate. A crystal was soaked in 1 mM CH3HgCl for 20 h and then transferred to 10 mM TMLA solution to be soaked for 72 h.
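
The merging statistic in the first footnote can be illustrated with a few lines of code. This is a minimal sketch, not the procedure implemented in SCALEPACK: the input layout (a dictionary from reflection index to repeated intensity measurements) and the decision to skip singly measured reflections are assumptions made for the example.

```python
def r_merge(measurements):
    """Compute sum|I - <I>| / sum(I) over repeatedly measured reflections.

    measurements: dict mapping a reflection index (h, k, l) to the list of
        intensities observed for that reflection (hypothetical layout).
    Reflections measured only once are skipped here, which is one common
    convention; other programs may treat them differently.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        if len(intensities) < 2:
            continue
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0:
        raise ValueError("no repeatedly measured reflections with non-zero intensity")
    return numerator / denominator


example = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
print(f"R_merge = {r_merge(example):.3f}")
```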





FIG. 1 Electron density maps with the refined model superimposed. a, Solvent-flattened multiple isomorphous replacement (MIR) map at 3.3 Å resolution; b, 2.4 Å-resolution deletion map calculated using 2F_o - F_c as amplitudes and using phases calculated from the structure omitting the portion of the model shown.

FIG. 2 Taq polymerase structure and comparison with that of KF. a, Helix-and-arrow overall schematic ribbon representation of Taq polymerase drawn using rendered MOLSCRIPT 28,29. α-Helices are represented as helical ribbons and β-strands as arrows. Helices and strands are lettered and numbered as in KF for the 3'-5' exonuclease and polymerase domains and lettered and numbered with primes in the 5' nuclease (previously called 5'-3' exonuclease) domain. The 5' nuclease domain at the N terminus is orange and yellow; its active site is marked by a red Zn2+ and two blue Mn2+ ions. The portion of this domain whose side chains have not been positioned is yellow (residues 172-234 and 1-12). The vestigial 3'-5' exonuclease domain is red and the polymerase domain is divided into green thumb, blue finger and purple palm subdomains. The active site Asp 610, Asp 785 and Glu 766 are in dark green. b, Superposition of KF and Taq polymerase. Stereo of Cα backbone of the KF polymerase domain (thin bonds) superimposed on the corresponding atoms (thick bonds) of the Taq polymerase, which are numbered. The three catalytic carboxylate side chains are shown in ball and stick representation at the bottom of the cleft. c, Superposition of 131 α-carbon atoms of the 3'-5' exonuclease domain of KF (thin bonds) on the corresponding atoms of Taq polymerase. The four catalytic carboxylates in KF are shown. d, Structure-based alignment of the sequences of the 3'-5' exonuclease domain of KF on the corresponding domain of Taq polymerase. The unaligned residues are shown as dots in E. coli polymerase I (E.c.) and the unpaired missing residues are shown as blank in Taq polymerase (T.aq.). The amino-acid sequence
numbers of Taq DNA polymerase secondary structure elements are as follows: 5' nuclease domain: 1'(3-7) 2'(12-17) A'(18-29) B'(42-57) 3'(60-67) C'(91-106) 4'(108-113) D'(119-132) 5'(134-139) E'(143-148) 6'(175-178) F'(179-183) G'(189-198) H'(20£-213) h'(217-221) rX225-232) J'(235-246) K'(266-276) L'(281-289); 3'-5' exonuclease domain: 1(294-29d) A(-) 2(305-312) 3(322-328) 4(330-336) B(338-344) 5(347-351) C(353-362) 5a(368-372) D(373-380) E(387-394) F(402-422); polymerase domain: G(424-447) 6(448-452) H(453-477) Ha(487-496) Hb(515-521) I(527-552) 7(5E9-598) 8(572-576) J(581-584) K(589-598) 9(603-613) L(614-623) M(626-634) N(638-646) O(656-670) Oa(675»723) Ob(688-699) P(702-717) 10(718-723) 11(724-729) Q(740-774) 12(776-784) 13(785-792) R(795-810) 14(816-825) Ra(82&-831).




[Fig. 2d: structure-based sequence alignment of the 3'-5' exonuclease domains of Taq polymerase (T.aq., residues 292-423) and E. coli polymerase I (E.c., residues 327-519); the aligned sequences are not legible in this copy.]

FIG. 3 The structure of the 5' nuclease domain of Taq polymerase. a, Overall ribbon drawing of the 5' nuclease domain drawn with rendered MOLSCRIPT 28,29; the missing residues in disordered loops are denoted by dots. b, Stereo α-carbon drawing of the 5' nuclease domain with the conserved carboxylates shown in ball and stick representation and the three metal ions shown as spheres. c, Close-up of the 5' nuclease active site showing the positions of the metal ion ligands as positioned in the apo enzyme. Difference electron density maps between the apo enzyme and crystals soaked in 1 mM Zn2+ and crystals soaked separately in 20 mM Mn2+ show the positions of one Zn2+ and two Mn2+ ions. Presumably the ligands reorient slightly to make optimal interactions with the metal ions, but the complex structures have not yet been refined.








be superimposed on the corresponding atoms in the 131-residue domain in the Taq enzyme to give an r.m.s. difference of 1.6 Å. One major difference in the overall structure of the 3'-5' domain of Taq polymerase as compared with that of KF is the deletion of four loops of lengths between 8 and 27 residues. In KF, these loops pack together on one side of the 3'-5' exonuclease domain (Fig. 2b). Furthermore, all four of the carboxylates (D424, D501, D355, E357) known to be essential for divalent metal binding and catalysis in the 3'-5' exonuclease domain of KF 3,4 have been replaced by residues incapable of binding metal ions (L356, R405, G308, V310) in the vestigial 3'-5' exonuclease domain of Taq polymerase. Although the 3'-5' exonuclease catalytic site has been destroyed and the size of the domain reduced, the contact with the pol domain and the distance between the polymerase and 3'-5' exonuclease domains remain similar in the two homologous polymerases.
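
The r.m.s. differences quoted for the two domain comparisons are root-mean-square distances between paired α-carbon atoms after superposition. The sketch below shows only that final statistic, assuming the two coordinate sets have already been superimposed and paired; it does not perform the least-squares fit itself, and the function names are illustrative rather than taken from any program used in the paper.

```python
import math


def rms_difference(coords_a, coords_b):
    """Root-mean-square distance between paired, already-superimposed atoms.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples in angstroms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def fraction_superimposed(n_matched, n_total):
    """Fraction of alpha-carbons that could be paired in a superposition."""
    return n_matched / n_total


# Atom counts quoted in the text for the two domains.
print(f"polymerase domain:  {fraction_superimposed(353, 407):.0%} of alpha-carbons paired")
print(f"3'-5' exo domain:   {fraction_superimposed(101, 194):.0%} of alpha-carbons paired")
```

Read this way, the counts in the text say that most α-carbons of the polymerase domain could be paired, against only about half of those in the KF 3'-5' exonuclease domain.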

The 5' nuclease domain forms a structure that is separate from the other two domains (Fig. ), with only 850 Å² of surface area in contact with the 3'-5' exonuclease domain, consistent with this domain's ability to function after its proteolytic removal from the rest of the protein (J. B. Dahlberg, personal communication, and ref. 5). It has a deep cleft that contains at its bottom the conserved carboxylates shown here to ligate divalent metal ions. A central β-sheet lies at the heart of the domain and is flanked on both sides by assemblies of five and six α-helices which form the walls of the active-site cleft (Fig. 3).

Alignment of the amino-acid sequences of six 5' nuclease domains from DNA polymerases in the pol I family shows six highly conserved sequence motifs containing ten conserved acidic residues 6. Seven of these residues (Asp 18, Asp 67, Glu 117, Asp 119, Asp 120, Asp 142 and Asp 144) cluster within a sphere of 7 Å radius, two (Asp 188 and Asp 191) lie in a region built as polyalanine, and one (Glu 76) occurs in a completely disordered loop. Crystallographic data from crystals soaked in divalent metal ions show that some of these carboxylates serve to ligate as many as three divalent metal ions (Fig. 3c). A difference Fourier using data from crystals soaked in 20 mM Mn2+ shows two peaks, corresponding to a metal-ion site III that is interacting with Glu 117, Asp 120, and possibly Asp 119, and a metal-ion site II that is interacting with Asp 142 and Asp 144. Soaking crystals in 1 mM Zn2+ reveals a divalent metal-ion-binding site I, whose ligands appear to be Asp 18, Asp 119, Asp 142 and perhaps His 21. Electron density maps of the apo protein show some density at metal-ion site I, which may indicate binding of a partially substituted Zn2+ ion. Sites I and II are separated by about 5 Å, whereas these two sites are each about 10 Å from site III.

As we do not yet have the structures of either substrate or product complexes with the enzyme, a firm mechanism for the nuclease reaction 7 cannot be proposed. However, a mechanism of phosphoryl transfer is becoming apparent in an increasing number of enzymes in which two divalent metal ions are involved. This two-metal-ion mechanism was suggested initially for the 3'-5' exonuclease of KF 8 and is supported by structural, mutagenic and kinetic studies 3,4,9. There is evidence that the enzymes alkaline phosphatase 10, pyrophosphatase 11,12, RNase H (refs 13, 14) and polymerase, to name a few, use a similar mechanism, as may other enzymes containing conserved, catalytically essential carboxylates (such as ruvC 18 and HIV integrase 19) 30. In this mechanism, the two metal ions are generally 4 Å apart, interact directly with the scissile phosphate, stabilize the pentacovalent intermediate, and generate the attacking hydroxide ion, as well as facilitating the departure of the 3' oxyanion 8. If such a two-metal-ion mechanism is relevant to the 5' nuclease, the possible role, if any, of the more distant site III metal ion can only be guessed.

The source of the thermal stability of Taq polymerase is not obvious from structural comparison with KF, but the number of hydrogen bonds has increased by four, and two salt bridges between subdomains in the polymerase domain become hydrophobic; the ratio of leucine to isoleucine has increased by 4.4-fold, and arginine to lysine by 1.3-fold, which may result from the higher G+C content of the leucine and arginine codons (giving a more thermostable DNA), rather than an effect on the protein.

An important question concerning the pol I family of enzymes is how the polymerase and 5' nuclease active sites work together to generate a duplex DNA product containing only a nick; the present structure raises at least as many questions as it answers, because we observe that these two active sites are separated by over 70 Å. The unusually elongated shape of the molecule seen here led us to examine its overall fold in solution. Preliminary measurements of the radius of gyration (R_g) of Taq polymerase by solution X-ray-scattering methods yield an experimental value of R_g that is substantially smaller than that calculated from the coordinates of the crystal structure (S.H.E. et al., unpublished observations). Thus the 5' nuclease domain is not positioned in solution as shown in Fig. 2, but must be located much closer to the centre of mass of the Stoffel fragment. Presumably its orientation in these crystals is adventitious and governed by crystal-packing interactions. Two packing interactions between the 5' nuclease and neighbouring molecules bury 1,100 and 1,466 Å² of solvent-accessible area, larger than the intramolecular interaction surface. A structural basis for understanding how these two activities work together must await the crystal structure of a complex with the appropriately nicked DNA substrate.
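
The comparison described above rests on calculating a radius of gyration from atomic coordinates. A minimal sketch of that calculation follows, assuming a plain list of (x, y, z) positions with optional per-atom weights; it ignores hydration and electron-density weighting, which a real comparison with solution scattering data would have to consider, and it is not the authors' procedure.

```python
import math


def radius_of_gyration(coords, weights=None):
    """Weighted r.m.s. distance of atoms from their centre of mass.

    coords: list of (x, y, z) tuples in angstroms.
    weights: optional per-atom weights (for example masses or electron
        counts); equal weights are assumed when omitted.
    """
    if not coords:
        raise ValueError("coords must not be empty")
    if weights is None:
        weights = [1.0] * len(coords)
    total = sum(weights)
    cx = sum(w * x for w, (x, _, _) in zip(weights, coords)) / total
    cy = sum(w * y for w, (_, y, _) in zip(weights, coords)) / total
    cz = sum(w * z for w, (_, _, z) in zip(weights, coords)) / total
    mean_sq = sum(
        w * ((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2)
        for w, (x, y, z) in zip(weights, coords)
    ) / total
    return math.sqrt(mean_sq)


# Toy comparison: the same four points arranged in a line and in a square.
extended = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
compact = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (10.0, 10.0, 0.0)]
print(radius_of_gyration(extended) > radius_of_gyration(compact))
```

The toy example shows the trend the argument relies on: a strung-out arrangement gives a larger R_g than a compact one built from the same parts.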



ERRATUM 



Crystal structure of a replication fork single-stranded DNA binding protein (T4 gp32) complexed to DNA

Yousif Shamoo, Alan M. Friedman, Mark R. Parsons, William H. Konigsberg & Thomas A. Steitz

Nature 376, 362-366 (1995)



An error in the production process resulted in Fig. 1a and b of the paper by Kim et al. on page 613 of this issue being substituted for Fig. 1a and b of the earlier paper by Shamoo et al. The correct panels of Fig. 2 are shown here.

FIG. 2 a, Experimental electron density map to 3 Å contoured at 2σ and calculated using the combined MAD and MIR phases that have been solvent flattened. The coordination of the Zn2+ ion (yellow) is tetrahedral with His 64, Cys 77, Cys 87 and Cys 90 as ligands. b, 2F_o - F_c electron density map contoured at 1.3σ showing a stretch of β-strand 4 that includes the current partly refined model and all the data to 2.2 Å.

Cloning and Analysis of the DNA Polymerase-encoding Gene from Thermus filiformis

Seung Eun Jung, Jeong Jin Choi, Hyun Kyu Kim and Suk-Tae Kwon*

Department of Genetic Engineering, Sung Kyun Kwan University, Suwon 440-746, Korea

(Received on August 14, 1997)

The gene encoding Thermus filiformis (Tfi) DNA polymerase was cloned and its nucleotide sequence was determined. The primary structure of Tfi DNA polymerase was deduced from its nucleotide sequence. Tfi DNA polymerase is composed of 833 amino acid residues and its molecular mass was determined to be 93,890 Da. The deduced amino acid sequence of Tfi DNA polymerase showed a high sequence homology to E. coli DNA polymerase I-like DNA polymerases: 78.5% homology to Taq DNA polymerase, 78.4% to Tca DNA polymerase, and 41.8% to E. coli DNA polymerase I. An extremely high sequence identity was observed in the region containing polymerase activity. The G+C content of the coding region for the Tfi DNA polymerase gene was 68.5%, which was higher than that of the chromosomal DNA (65%). The G+C contents in the first, second, and third positions of the codons used were 71.8%, 40.9%, and 92.7%, respectively. Codon usage in Tfi DNA polymerase was heavily biased towards the use of G+C in the third position. Rare codons with U or A as the third base were sometimes used to avoid using GA(A/T)TC and TCGA sequences, as they are recognition sites for the restriction endonucleases TfiI and TaqI.
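
The position-specific G+C figures in the abstract can be reproduced for any coding sequence with a simple count. The following is a minimal sketch that assumes an in-frame coding sequence given as a plain string; the function name is illustrative and the short test sequence is made up, not taken from the Tfi gene.

```python
def gc_by_codon_position(cds):
    """Return G+C percentages at codon positions 1, 2 and 3.

    cds: in-frame coding sequence (DNA or RNA letters); its length must be
        a non-zero multiple of three.
    """
    seq = cds.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")
    n_codons = len(seq) // 3
    counts = [0, 0, 0]
    for index, base in enumerate(seq):
        if base in "GC":
            counts[index % 3] += 1
    return tuple(100.0 * c / n_codons for c in counts)


# Made-up example: codons ATG, GCC, TAA.
first, second, third = gc_by_codon_position("ATGGCCTAA")
print(f"{first:.1f} {second:.1f} {third:.1f}")
```

Because every codon contributes one base to each position, the overall G+C content of a coding region is the mean of the three position values. For the figures in the abstract, (71.8 + 40.9 + 92.7) / 3 is about 68.5%, which agrees with the overall value reported for the coding region.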



DNA polymerase is one of the most important enzymes for DNA repair and replication in living cells. Many different DNA polymerase genes have been cloned and sequenced. Their deduced amino acid sequences have been reported from nucleotide sequence data (Joyce et al., 1982; Lawyer et al., 1989; Lopez et al., 1989). The amino acid sequences of these DNA polymerases have been aligned and partial homologous regions have been identified (Bernad et al., 1989; Blanco et al., 1991; Ito and Braithwaite, 1991). On the basis of segmental similarities in the amino acid sequences, DNA polymerases have been classified into two major groups represented by E. coli DNA polymerase I-like prokaryotic DNA polymerases and DNA polymerase α-like prokaryotic and eukaryotic DNA polymerases (Bernad et al., 1989; Blanco et al., 1991). A classification of DNA polymerases into families A, B, and C according to the homology of the amino acid sequence with E. coli DNA polymerase I, II and III, respectively, has been proposed (Ito and Braithwaite, 1991).

We are interested in cloning genes coding for thermostable DNA polymerases, which are useful for polymerase chain reaction (PCR). Recently, PCR has become a powerful method for the identification and amplification of genes, their direct sequencing, and clinical diagnosis. The thermostable DNA polymerase is the key ingredient of PCR. Early experiments used

* To whom correspondence should be addressed. 



the thermolabile Klenow fragment, which had to be added every cycle (Saiki et al., 1985). The introduction of thermostable DNA polymerase allowed the automation of the process (Saiki et al., 1988). Accordingly, thermostable DNA polymerase was much more stable and suitable in thermocycles during PCR (Erlich, 1989; Saiki et al., 1988).

The purification procedures and properties of thermostable DNA polymerases have been reported for thermophilic bacteria in the genus Thermus such as T. aquaticus YT-1 (Chien et al., 1976), T. ruber (Kaledin et al., 1980), T. flavus (Kaledin et al., 1982), T. thermophilus HB8 (Ruettimann et al., 1985) and T. caldophilus GK24 (Park et al., 1993). However, no information is available on properties of thermostable DNA polymerase from T. filiformis.

T. filiformis was isolated from a New Zealand hot spring and was described as a member of the genus Thermus by Hudson et al. (1987). T. filiformis always forms long filaments consisting of chains of cells and so can be distinguished from other Thermus



The abbreviations used are: PCR, polymerase chain reaction; Taq DNA polymerase, DNA polymerase isolated from Thermus aquaticus YT-1; Tca DNA polymerase, DNA polymerase isolated from Thermus caldophilus GK24; Tfi DNA polymerase, DNA polymerase isolated from Thermus filiformis. The nucleotide sequence data reported in this paper will appear in the GenBank nucleotide sequence databases with the accession number AF03032< .

© 1997 The Korean Society for Molecular Biology 
strains in morphology.

In this paper we report (i) the cloning of the gene for Tfi DNA polymerase, (ii) the nucleotide sequence of the Tfi DNA polymerase gene and its deduced amino acid sequence, (iii) comparison of the amino acid sequence of Tfi DNA polymerase with those of other E. coli DNA polymerase I-like DNA polymerases, and (iv) the analysis of the gene.

Materials and Methods 

Bacterial strains and culture conditions 

T. filiformis (ATCC 43280) (Hudson et al., 1987) cells were prepared as described by Ramaley and Hixson (1970). E. coli strain MV1184 (Sambrook et al., 1989) was used as the host for plasmid preparations and was grown in LB medium supplemented with 0.1% glucose. Ampicillin (100 µg/ml) was added when needed. The E. coli cells were grown at 37 °C. Plates were solidified with 1.5% agar.

Enzymes and reagents 

TfiI restriction endonuclease was purchased from New England Biolabs, Inc. T4 DNA ligase, polynucleotide kinase, DNA molecular weight marker X, and other restriction enzymes were purchased from Boehringer Mannheim GmbH. Taq DNA polymerase was prepared as described previously (Kwon et al., 1991). An oligo labeling kit and radioactive nucleotides were purchased from Amersham, and Deaza 35 Sequencing™ Mixes, plasmids pUC18/19 (Norrander et al., 1983), and pBluescript II SK+/- (Alting-Mees and Short, 1989; Short et al., 1988) were purchased from Pharmacia LKB Biotechnology, Inc. Other reagents were obtained from Sigma.

Molecular cloning and DNA hybridization techniques 

Most of the methods used for molecular cloning were based on those of Sambrook et al. (1989). E. coli MV1184 was mainly used as a host for plasmid preparations. Chromosomal DNA of T. filiformis was isolated by the method of Marmur (1961). Plasmid DNA was prepared by a modified alkaline extraction method (Sambrook et al., 1989). The transformation of E. coli was performed as described by Hanahan (1983) and Kushner (1973). DNA was labeled by nick-translation according to Rigby et al. (1977). The DNA probe used for the DNA-DNA hybridization to detect the Tfi DNA polymerase gene was the 1.8 kb HindIII fragment from pKTPOL10 containing the Taq DNA polymerase gene (Kwon et al., 1991). Agarose gel membrane hybridization was performed by the method of Silhavy et al. (1984). Colony hybridization was performed by the method of Hanahan and Meselson (1980).

DNA sequencing and computer-assisted analyses 

The restriction fragments to be sequenced were cloned into appropriate restriction sites of pUC18/19 and pBluescript II SK+/- vectors. DNA sequencing by the dideoxynucleotide chain-termination method was performed according to Hattori and Sakaki (1986) using an alkali-denatured plasmid DNA as the template and universal primer. Sequence data were analyzed using PCGENE and DNASIS as DNA analysis programs.
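
One analysis of this kind is locating restriction-enzyme recognition sites in a sequence, such as the GA(A/T)TC (TfiI) and TCGA (TaqI) sites discussed in the abstract. The sketch below is an illustration only and is unrelated to how PCGENE or DNASIS work internally; the function name and the test sequence are made up. Both recognition sequences are palindromic, so scanning one strand is sufficient.

```python
import re

# Recognition sequences as given in the abstract; [AT] encodes the A/T choice.
RECOGNITION_SITES = {"TfiI": "GA[AT]TC", "TaqI": "TCGA"}


def find_sites(sequence):
    """Return 1-based start positions of each recognition site.

    A lookahead pattern is used so that overlapping occurrences are all
    reported rather than only the first of an overlapping pair.
    """
    seq = sequence.upper()
    hits = {}
    for enzyme, pattern in RECOGNITION_SITES.items():
        hits[enzyme] = [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]
    return hits


# Made-up test sequence containing GAATC, TCGA and GATTC.
print(find_sites("AAGAATCTTTCGAGATTC"))
```

A scan like this, run over a coding sequence, would show how often such sites occur and where synonymous codon choices could remove them.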

Results and Discussion 

Cloning of the Tfi DNA polymerase gene 

To clone the Tfi DNA polymerase gene, the structural gene coding for Taq DNA polymerase was used as a hybridization probe (Kwon et al., 1991). Chromosomal DNA prepared from T. filiformis was digested with restriction enzymes, followed by separation by 0.8% agarose gel electrophoresis. The agarose gel was dehydrated, and agarose-membrane hybridization was performed using the 32P-labeled Taq DNA polymerase gene. The probe hybridized to BamHI fragments of approximately 4.8 kb and 2.2 kb, a HindIII fragment of approximately 12 kb, and PstI fragments of approximately 6 kb and 4.5 kb (Fig. 1). The two BamHI fragments were suitable for cloning, because they were smaller than the PstI and HindIII fragments. Accordingly, T. filiformis DNA (100 was digested with BamHI and then electrophoresed on a low-melting agarose gel. The resulting DNA fragments were separately collected from the regions containing the 4.8 kb and 2.2 kb BamHI fragments. The 4.8 kb and 2.2 kb BamHI fragments were separately ligated at the BamHI site in the multiple cloning sites of plasmid vector pUC18, and then E. coli MV1184 was

Figure 1. Southern blot analysis of T. filiformis DNA digested with restriction endonucleases. The DNA probe used for DNA hybridization is the 1.8 kb HindIII fragment from pKTPOL10 containing the Taq DNA polymerase gene (Kwon et al., 1991). A) EtBr-stained agarose gel. B) Southern blot hybridization with DNA probe. Lane 1, DNA molecular weight marker X (0.07-12.2 kbp); lane 2, BamHI-digested genomic DNA; lane 3, HindIII-digested genomic DNA; lane 4, PstI-digested genomic DNA.







Vol. 7 (1997) 



Seung Eun Jung et al. 




Figure 2. Restriction map and positions of the cloned DNA fragments in plasmids pTFPL and pTFPS. The restriction sites used for the subcloning and DNA sequencing of cloned DNA fragments are shown: B, BamHI; P, PstI; S, SacI; X, XhoI. The position of the Tfi DNA polymerase gene in the cloned fragments is indicated by the open arrow.

transformed with the plasmids. After colony hybridization,
more than 16 clones showed a prominent reaction with the
probe. Plasmid DNAs were prepared from these clones, and
finally two kinds of plasmids, named pTFPL and pTFPS, were
obtained. pTFPL contained the 4.8 kb BamHI fragment to which
the labeled probe hybridized, and pTFPS contained the 2.2 kb
BamHI fragment to which the labeled probe hybridized (Fig. 2).



Nucleotide sequence of the Tfi DNA polymerase gene
and its deduced amino acid sequence

The restriction maps of the 4.8 kb and 2.2 kb BamHI fragments
are presented in Figure 2. Each enzyme site in the restriction
maps was used for the subcloning and DNA sequencing of cloned
DNA fragments. Both strands of the subclones containing the
gene of Tfi DNA polymerase were sequenced. The position of the
Tfi DNA polymerase gene in the cloned fragments is indicated
by the open arrow. Figure 3 shows the nucleotide sequence of
the DNA and the deduced amino acid sequence of Tfi DNA
polymerase. Tfi DNA polymerase consists of 833 amino acid
residues, and its molecular mass was determined to be
93,890 Da.
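The stated mass can be cross-checked against the residue counts. The sketch below sums average residue masses over the Tfi column of Table 1 and adds one water molecule. The residue masses and the names `RESIDUE_MASS`, `TFI_COMPOSITION`, and `molecular_mass` are assumptions introduced here for illustration (standard reference values, not numbers from the paper), and this is not necessarily how the authors calculated the mass.

```python
# Average residue masses in daltons (amino acid minus one water).
# Standard reference values, assumed here; they are not from the paper.
RESIDUE_MASS = {
    "Ala": 71.0788, "Arg": 156.1875, "Asn": 114.1038, "Asp": 115.0886,
    "Cys": 103.1388, "Gln": 128.1307, "Glu": 129.1155, "Gly": 57.0519,
    "His": 137.1411, "Ile": 113.1594, "Leu": 113.1594, "Lys": 128.1741,
    "Met": 131.1926, "Phe": 147.1766, "Pro": 97.1167, "Ser": 87.0782,
    "Thr": 101.1051, "Trp": 186.2132, "Tyr": 163.1760, "Val": 99.1326,
}
WATER = 18.01528  # one water for the free NH2 and COOH termini

# Residue counts for Tfi DNA polymerase as listed in Table 1.
TFI_COMPOSITION = {
    "Ala": 82, "Arg": 77, "Asn": 14, "Asp": 47, "Cys": 0, "Gln": 19,
    "Glu": 84, "Gly": 55, "His": 18, "Ile": 15, "Leu": 128, "Lys": 39,
    "Met": 14, "Phe": 34, "Pro": 50, "Ser": 32, "Thr": 35, "Trp": 9,
    "Tyr": 19, "Val": 62,
}


def molecular_mass(composition):
    """Sum residue masses weighted by count, plus one water."""
    return sum(RESIDUE_MASS[aa] * n for aa, n in composition.items()) + WATER


n_residues = sum(TFI_COMPOSITION.values())
mass = molecular_mass(TFI_COMPOSITION)
print(f"{n_residues} residues, about {mass:,.0f} Da")
```

With these assumed masses the Tfi column sums to 833 residues and a mass within a few daltons of the 93,890 Da quoted above.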

The amino acid composition of Tfi DNA polymerase, calculated
from the deduced sequence, is shown in Table 1 and is compared
with those of other E. coli DNA polymerase I-like DNA
polymerases. The amino acid composition of Tfi DNA polymerase
is similar to that of the other enzymes. Thermophilic
organisms cannot regulate their internal temperature.
Consequently, thermophilic organisms must possess



Figure 3. Nucleotide sequence of the cloned DNA fragments and
deduced amino acid sequence of Tfi DNA polymerase. The
numbering of nucleotides starts at the 5'-terminus of the gene
encoding Tfi DNA polymerase, and that of amino acids starts at
the NH2-terminus of Tfi DNA polymerase. A putative
Shine-Dalgarno sequence is underlined. Asterisks indicate the
stop codon.
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Table 1. Amino acid composition of Tfi DNA polymerase 
in comparison with those of other DNA polymerases 



Amino    Tfi DNA      Taq DNA      Tca DNA      E. coli DNA
acid     polymerase   polymerase   polymerase   polymerase I

Ala         82           91           89            99
Arg         77           76           68            45
Asn         14           12           18            32
Asp         47           42           41            51
Cys          0            0            1             2
Gln         19           16           22            39
Glu         84           87           83            80
Gly         55           58           54            57
His         18           18           21            21
Ile         15           25           16            53
Leu        128          124          127           106
Lys         39           42           45            59
Met         14           16           16            25
Phe         34           27           29            24
Pro         50           48           51            50
Ser         32           31           30            39
Thr         35           30           31            50
Trp          9           14           12             7
Tyr         19           24           20            32
Val         62           51           60            57

Total      833          832          834           928
Mr      93,890       93,922       93,810       103,130


intrinsically thermostable cellular enzymes. A number of
reports have been written on the enhanced stability of
thermophilic enzymes (Thomas and William, 1986). The
thermostability of an enzyme is a basic function of the
enzyme's stabilizing forces. These include hydrophobic
interactions, disulfide bridges, ionic interactions, hydrogen
bonding, and metal binding. In their absence, destabilizing
forces arise from the conformational entropy of the protein.
Each of these stabilizing forces, either by itself or in
combination, has been suggested as a possible basis for
enhanced thermostability. However, Tfi DNA polymerase contains
no cysteine residues and therefore cannot form a disulfide
bridge (Table 1). The proportions of hydrophobic amino acids in
Tfi DNA polymerase and E. coli DNA polymerase I are also
similar, although the thermostabilities of the two enzymes
differ. Therefore, the thermostability of Tfi DNA polymerase
cannot be explained by a comparison of amino acid
compositions. Tfi, Taq, and Tca DNA polymerases have lower Lys
and higher Arg contents than E. coli DNA polymerase I, which
is characteristic of enzymes derived from the genus Thermus
(Kagawa et al., 1984; Kunai et al., 1986).
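The composition arguments in this paragraph can be reproduced from Table 1 with a few ratios. The sketch below is illustrative only: the choice of residues counted as hydrophobic (`HYDROPHOBIC`) is an assumption made here, since the text does not say which residues were included, and the helper names are hypothetical.

```python
# Residue counts copied from Table 1 (only the residues needed here).
COUNTS = {
    "Tfi": {"Ala": 82, "Arg": 77, "Cys": 0, "Ile": 15, "Leu": 128,
            "Lys": 39, "Met": 14, "Phe": 34, "Trp": 9, "Val": 62,
            "Total": 833},
    "Taq": {"Ala": 91, "Arg": 76, "Cys": 0, "Ile": 25, "Leu": 124,
            "Lys": 42, "Met": 16, "Phe": 27, "Trp": 14, "Val": 51,
            "Total": 832},
    "Tca": {"Ala": 89, "Arg": 68, "Cys": 1, "Ile": 16, "Leu": 127,
            "Lys": 45, "Met": 16, "Phe": 29, "Trp": 12, "Val": 60,
            "Total": 834},
    "Eco": {"Ala": 99, "Arg": 45, "Cys": 2, "Ile": 53, "Leu": 106,
            "Lys": 59, "Met": 25, "Phe": 24, "Trp": 7, "Val": 57,
            "Total": 928},
}

# Assumed definition of "hydrophobic"; the paper does not specify one.
HYDROPHOBIC = ("Ala", "Ile", "Leu", "Met", "Phe", "Trp", "Val")


def arg_lys_ratio(enzyme):
    """Ratio of Arg to Lys residues for one enzyme."""
    c = COUNTS[enzyme]
    return c["Arg"] / c["Lys"]


def hydrophobic_fraction(enzyme):
    """Fraction of residues in the assumed hydrophobic set."""
    c = COUNTS[enzyme]
    return sum(c[aa] for aa in HYDROPHOBIC) / c["Total"]


for name in COUNTS:
    print(name,
          f"Arg/Lys={arg_lys_ratio(name):.2f}",
          f"hydrophobic={hydrophobic_fraction(name):.3f}",
          f"Cys={COUNTS[name]['Cys']}")
```

Under this assumed residue set, the three Thermus enzymes have Arg/Lys ratios above 1 and E. coli DNA polymerase I has a ratio below 1, while the hydrophobic fractions of Tfi and E. coli DNA polymerase I differ by only a few percentage points.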

Comparison of the amino acid sequence of Tfi DNA 
polymerase with those of other E. coli DNA 
polymerase I-like DNA polymerases 

The whole amino acid sequence of Tfi DNA polymerase showed
high homology to those of the E. coli DNA polymerase I-like
DNA polymerases Taq DNA polymerase (Lawyer et al., 1989), Tca
DNA polymerase (Kwon et al., 1997), and E. coli DNA polymerase
I (Joyce et al., 1982) (Fig. 4). Extremely high sequence
homology was observed among the Tfi, Taq, and Tca DNA
polymerases. Tfi DNA polymerase shows 78.5% homology to Taq
DNA polymerase, 78.4% to Tca DNA polymerase, and 41.8% to
E. coli DNA polymerase I.
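Percentages like these are usually obtained by counting identical residues in a pairwise alignment. The sketch below shows one common convention, which ignores columns containing a gap. The text does not state which convention or alignment program was used, so the function `percent_identity` and the toy sequences are purely illustrative assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not taken from Figure 4).
score = percent_identity("MKAL-LE", "MRALGLE")
print(f"{score:.1f}% identity")
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which would give slightly different values for the same alignment.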

In the case of E. coli DNA polymerase I, proteolytic cleavage
separates the polypeptide chain into two active fragments: a
smaller NH2-terminal fragment containing the 5'→3' exonuclease
activity and a larger COOH-terminal fragment that contains the
polymerase and 3'→5' exonuclease activities (Derbyshire et
al., 1988; Jacobsen et al., 1974; Klenow and Henningsen, 1970;
Ollis et al., 1985). The NH2-terminal regions of Tfi, Taq, and
Tca DNA polymerases correspond to the NH2-terminal domain of
E. coli DNA polymerase I. In Tfi DNA polymerase, the first 254
amino acids from the NH2 terminus showed homology to the 5'→3'
exonuclease domain of E. coli DNA polymerase I. In agreement
with these structural data, Tfi, Taq, and Tca DNA polymerases
exhibit 5'→3' exonuclease activity. The COOH-terminal regions
of Tfi, Taq, and Tca DNA polymerases correspond to that of
E. coli DNA polymerase I containing the DNA polymerase
activity. As shown in Figure 4, this region is conserved in
most of the DNA polymerases, suggesting that it corresponds to
an evolutionarily conserved DNA polymerase domain (Blanco et
al., 1991).

As a result of mutations, deletions, and substitutions during
evolution, Tfi DNA polymerase residues at positions 255-433
show little sequence similarity to the E. coli DNA polymerase
I domain (at positions 261-529) assumed to contain the 3'→5'
exonuclease activity (Derbyshire et al., 1988; Ollis et al.,
1985). In this region, the amino acid sequence of Tfi DNA
polymerase showed especially high homology to those of Taq
and Tca DNA polymerases, but E. coli DNA polymerase I had a
highly different structure and showed little similarity to the
others. Tfi DNA polymerase is 95 residues shorter than E. coli
DNA polymerase I, and most of the deleted residues fall in the
region encompassing residues 255-433. In E. coli DNA
polymerase I, this domain contains the 3'→5' exonuclease
active site. A common feature of many DNA polymerases is a
3'→5' exonuclease activity that is partly responsible for the
high fidelity of DNA replication (Kunkel, 1988). This
evolutionarily conserved active site is mainly formed by the
highly conserved regions ExoI, ExoII, and ExoIII (Blanco et
al., 1991), as shown in Figure 4. However, Tfi DNA polymerase
has no regions corresponding to the highly conserved regions
ExoI, ExoII, and ExoIII of E. coli DNA polymerase I.
Therefore, it is reasonable to believe that Tfi, Taq, and Tca
DNA polymerases do not possess as much 3'→5' exonuclease
activity as E. coli DNA polymerase I. Indeed, 3'→5'
exonuclease activity cannot be detected in the purified Tca
DNA polymerase (Park et al., 1992).

Analysis of 5'- and 3'-noncoding regions of the Tfi
DNA polymerase gene

Figure 4. Comparison of the amino acid sequence of Tfi DNA
polymerase with those of other E. coli DNA polymerase I-like
DNA polymerases. The sequence of Tfi DNA polymerase (Tfi) is
shown as compared with those of Taq DNA polymerase (Taq)
(Lawyer et al., 1989), Tca DNA polymerase (Tca) (Kwon et al.,
1997), and E. coli DNA polymerase I (Eco) (Joyce et al.,
1982). Identical amino acids between Tfi DNA polymerase and
the others are indicated by stippled boxes. Dark-shaded boxes
indicate the three highly conserved regions ExoI, ExoII, and
ExoIII, proposed to form a general 3'→5' exonuclease active
site (Blanco et al., 1991).

Analysis of the gene that codes for Tfi DNA polymerase reveals
other interesting features. In the 5'-noncoding region of the
gene, the 5'-GAGG-3' segment at position -6 to -3 upstream of
the translation start codon (ATG for Met) complements the 3'
end of the 16S rRNA of E. coli (Shine and Dalgarno, 1975) and
resembles a ribosome binding site.
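The complementarity claim can be checked directly: the reverse complement of GAGG should occur near the 3' end of the 16S rRNA. In the sketch below, the 16S rRNA 3'-terminal sequence is the commonly cited E. coli sequence and is supplied here as an assumption (it does not appear in the text); the names are hypothetical.

```python
# RNA base-pairing table (Watson-Crick pairs only).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Commonly cited 3' end of E. coli 16S rRNA, written 5'->3'.
# Assumed here for illustration; it is not given in the paper.
RRNA_16S_3PRIME_END = "GAUCACCUCCUUA"

# Putative Shine-Dalgarno segment from the text (same letters in mRNA).
sd_segment = "GAGG"

pairing_partner = reverse_complement_rna(sd_segment)
is_complementary = pairing_partner in RRNA_16S_3PRIME_END
print(pairing_partner, is_complementary)
```

This only tests perfect Watson-Crick pairing of the four-base segment and says nothing about spacing to the start codon or about G-U wobble pairs.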

Except for the promoter sequence of the T. flavus
succinyl-coenzyme A synthetase-malate dehydrogenase
(Nishiyama et al., 1991), a promoter-like sequence with -35
and -10 regions that can function in E. coli has not been
found in most of the genes derived from the genus Thermus
(Kunai et al., 1986; Kwon et al., 1997; Lawyer et al., 1989).
No such promoter-like sequence was found in the upstream
flanking region of the Tfi DNA polymerase gene either.

In the 3'-noncoding region of the gene, there was no potential
transcriptional termination sequence able to form a
stem-and-loop structure followed by a pyrimidine-rich sequence
(Fig. 3).

High G+C content of the third positions in the codon 
usage 

The G+C content of the Tfi DNA polymerase gene was 68.5%,
slightly higher than that of chromosomal DNA (65%) (Hudson et
al., 1987). The G+C contents in the first, second, and third
positions of the codons used were 78.1%, 40.9%, and 92.7%,
respectively. Codon usage in Tfi DNA polymerase was heavily
biased towards the use of G and C in the third position, as
expected for an organism with G+C-rich DNA (Table 2).
Essentially identical third-position codon bias has been
observed for other Thermus genes: 93% G+C in the third
position for the Tca DNA polymerase gene of T. caldophilus
GK24 (Kwon et al., 1997), 91.8% for the Taq DNA polymerase
gene of T. aquaticus YT-1 (Lawyer et al., 1989), 94.8% for the
malate dehydrogenase gene of T. flavus AT-62 (Nishiyama et
al., 1986), and 89.4% for the isopropylmalate dehydrogenase
gene of T. thermophilus HB8 (Kagawa et al., 1984). Codons, the
third positions of which are U or A, are thus rarely used, and
only 61 such codons were observed among 834 codons in the Tfi
DNA polymerase gene.
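Position-specific G+C content is a simple count over the coding sequence. The sketch below shows the calculation on a toy sequence and then checks that the reported third-position value is consistent with the stated number of A/U-ending codons. The function name `gc_by_codon_position` and the toy sequence are assumptions for illustration; the gene sequence itself (Fig. 3) is not reproduced here.

```python
def gc_by_codon_position(cds):
    """Return percent G+C at codon positions 1, 2 and 3 of a coding sequence."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    n_codons = len(cds) // 3
    counts = [0, 0, 0]
    for i, base in enumerate(cds):
        if base in "GC":
            counts[i % 3] += 1  # i % 3 is the position within the codon
    return tuple(100.0 * c / n_codons for c in counts)


# Toy coding sequence (hypothetical): ATG GCC CTG AAG
toy = gc_by_codon_position("ATGGCCCTGAAG")
print(toy)

# Consistency check on the figures in the text: 92.7% G+C at the third
# position of 834 codons leaves about 7.3% of codons ending in A or U.
expected_at_codons = round(834 * (100 - 92.7) / 100)
print(expected_at_codons)
```

The second calculation gives 61, matching the number of A/U-ending codons reported above.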

The codons of Arg, Ala, Pro, and Gly generally raise the G+C
content of DNA, while those of Lys, Ile, Met, Tyr, Asn, and
Phe raise the A+T content (Table 2). The amino acid
compositions of DNA polymerases from T. filiformis and E. coli
are shown in Table 1. In particular, the Arg content was much
higher in the Tfi DNA polymerase gene than in the E. coli DNA
polymerase I gene. On the other hand, Lys, Ile, Met, Tyr, and
Asn levels were much lower in the Tfi DNA polymerase gene than
in the E. coli DNA polymerase I gene. These changes in amino
acid composition increase the G+C content of the DNA.

Special codon usage avoiding the GA(A/T)TC (TfiI
site) and TCGA sequence (TaqI site)

Chromosomal DNA from T. filiformis was not digested by TfiI
restriction endonuclease at all. There was no nucleotide
sequence of GA(A/T)TC (TfiI recognition site) in the sequenced
region of T. filiformis DNA (2,502 nucleotides) (Fig. 3).
There was also no nucleotide sequence of TCGA (TaqI
recognition site) except for three nucleotide sequences of
CTCGAG (XhoI site) and one nucleotide sequence of TTCGAC in
the sequenced region of T. filiformis DNA (2,502 nucleotides)
(Fig. 3). TCGA is the recognition sequence for the restriction
endonucleases TaqI and TthHB8I (Barany et al., 1992), which
have been purified from T. aquaticus YT-1 and T. thermophilus
HB8, respectively. The TaqI site was not present in the DNA
sequences of various Thermus chromosomes (Kunai et al., 1986;
Kwon et al., 1988). Chromosomal DNA from T. filiformis was
also not digested by TaqI and TthHB8I at all, suggesting that
T. filiformis has the same host restriction and modification
system as other Thermus species. A DNA adenine methylase from
T. thermophilus HB8 has been reported (Sato et al., 1980).
This enzyme recognizes sequences of CTCGAG and TTCGAC in
Thermus cells, and the methylated sequence of TCG*A cannot be
used as a substrate for TaqI.

Table 2. Codon usage in the gene for Tfi DNA polymerase in
comparison with that in the genes for Taq and Tca DNA
polymerases
Tfi, Tfi DNA polymerase; Taq, Taq DNA polymerase; Tca, Tca DNA
polymerase.

We have examined the numbers of NCGA, TNGA, TCNA, and TCGN
sequences, which are sequences similar to TCGA (TaqI
recognition site). In the sequenced region of T. filiformis
DNA (Fig. 3), the numbers were 33, 53, 35, and 26,
respectively. We have also examined the numbers of NA(A/T)TC,
GN(A/T)TC, GANTC, GA(A/T)NC, and GA(A/T)TN sequences, which
are sequences similar to GA(A/T)TC (TfiI recognition site). In
the sequenced region of the T. filiformis DNA (Fig. 3), the
numbers were 0, 11, 4, 11, and 2, respectively. These data
suggest that the TCGA and GA(A/T)TC sequences are avoided in
T. filiformis.
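Counts of this kind can be obtained by scanning the sequence with degenerate patterns. The sketch below counts overlapping matches of a recognition site and of each variant in which one position is relaxed to N. It is an illustration under stated assumptions: `W` is the IUPAC code for A/T, `N` is treated here as any base (so exact sites are included in the relaxed counts, which may differ from the convention used in the text), and the function names and toy sequences are hypothetical.

```python
import re

# Minimal IUPAC subset needed for the two recognition sites.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}


def count_sites(seq, pattern):
    """Count overlapping occurrences of a degenerate pattern in seq."""
    body = "".join(IUPAC[c] for c in pattern)
    # A zero-width lookahead lets overlapping matches be counted.
    regex = re.compile("(?=" + body + ")")
    return sum(1 for _ in regex.finditer(seq.upper()))


def one_position_relaxed(pattern):
    """Return the variants of pattern with one position replaced by N."""
    return [pattern[:i] + "N" + pattern[i + 1:] for i in range(len(pattern))]


# Toy sequence (hypothetical, not the Tfi gene).
toy_seq = "CTCGAGTTCGACACGA"
taq_exact = count_sites(toy_seq, "TCGA")
taq_relaxed = {p: count_sites(toy_seq, p) for p in one_position_relaxed("TCGA")}
tfi_exact = count_sites("GAATCGATTC", "GAWTC")
print(taq_exact, taq_relaxed, tfi_exact)
```

Applied to the 2,502-nucleotide sequenced region, the same two functions would give the exact-site and relaxed-site counts for TaqI (`TCGA`) and TfiI (`GAWTC`) under the assumptions above.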

Avoiding the sequences of GA(A/T)TC and TCGA sometimes results
in the usage of rare codons (A or T in the third position) in
the Tfi DNA polymerase gene. There were only 22 rare codons,
the third bases of which are U or A, in the Tfi DNA polymerase
gene (Fig. 3, Table 2). We are conducting experiments to
express the Tfi DNA polymerase gene in E. coli.
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